(54) Title: VIEWER FOR OPTICAL FLOW THROUGH A 3D TIME SEQUENCE
(57) Abstract 

A three-dimensional viewing technique that allows an 
operator to visualize the result of optical flow analysis through 
a sequence of images. The technique builds a track representing 
the movement of each feature point in a sequence of images 
and furthermore builds such a track for each feature point, to 
sub-pixel accuracy as desired. The tracks are displayed
in a 3D coordinate system whereby the x and y coordinates 
correspond to the coordinates of the feature in the image 
coordinate system and the z coordinate is a number associated 
with the temporal ordering of each image in the sequence. The 
resulting track display represents the evolution of the optical flow 
over time. 
VIEWER FOR OPTICAL FLOW THROUGH A 3D TIME SEQUENCE

FIELD OF THE INVENTION 

The present invention relates to computer image 
processing and in particular to a technique for 
visualizing feature tracks and identifying errors and 
anomalies therein prior to subsequent processing.

BACKGROUND 

An image processing function called feature
tracking is the process of selecting features from an
initial scene and then tracking these features across a
related series of images of the same scene. Each image
is typically represented as an array of pixel values,
and a feature point in such an image is typically
identified as a region of one or more pixels (or
sub-pixels).

Feature tracking is the basis for several
techniques whereby multiple feature points are
simultaneously tracked across related image frames to
develop further information about the scene. These
include techniques for tracking two-dimensional shapes
across frames, for estimating three-dimensional paths
of selected feature points, for estimating
three-dimensional camera paths, and for recovering
estimated three-dimensional scene structure (including
estimated depths of object surfaces). The use of
feature tracking techniques in these applications can
be very powerful, because they transform an image
processing problem into a domain where geometric
constraints can be applied.

Most feature tracking methods are highly sensitive
to the initial selection of each feature point.
Automated feature point selection is typically done
using criteria applied solely to the initial frame
(such as choosing an area of high contrast). This
selection can easily prove to be a poor choice for
tracking in successive frames. Likewise, a manual
selection made by a human operator may not be well
suited for tracking over multiple frames.

When features are tracked independently, selection
sensitivity becomes critical. Even when multiple
features can be correlated and tracked as a group,
reducing selection sensitivity depends on tracking all
the features across multiple image frames while
maintaining the correlation between them.

A feature can be "lost" due to imaging artifacts
such as noise or transient lighting conditions. These
artifacts can make it difficult or impossible to
distinguish the feature identified in one frame from
its surroundings in another frame. A feature can also
be lost when it is visible in one frame but occluded
(or partially occluded) in another. Feature occlusion
may be due to changing camera orientation, and/or
movement of one or more object(s) in the visual scene.

A lost feature can reappear in yet another frame,
but not be recognized as a continuation of a previously
identified feature. This feature might be ignored, and
remain lost. It may instead be incorrectly identified
and tracked as an entirely new feature, creating a
"broken path." A broken path has two (or more)
discontinuous segments such that one path ends where
the feature was lost, and the next path begins where
the feature reappears. A single feature may therefore
be erroneously tracked as multiple unrelated and
independent features, each with its own unique piece of
the broken path.

The above conditions that lead to a lost feature
can also contribute to a "bad match." A bad match is a
feature identified in one frame that is incorrectly
matched to a different feature in another frame. A bad
match can be even more troublesome than a lost feature
or broken path, since the feature tracking algorithm
proceeds as if the feature were being correctly
tracked.

SUMMARY OF THE INVENTION 

The advantages of feature tracking have been
demonstrated in experimental results and in field
trials, particularly in applications that derive higher
level scene information by automatically tracking and
correlating multiple feature points. However, the
limitations of feature tracking methods as discussed
above reduce their utility in certain practical
settings. A tool that would enable a user to visualize
the output of feature tracking to identify bad matches
or other instances in which movement is being tracked
incorrectly could greatly improve the utility of
automatic feature tracking.

It is also desirable to eliminate erroneous tracks
and to correct anomalies in tracks as much as possible
prior to their being input to automatic camera and
scene modeling algorithms, because their presence in
the tracking data causes errors in resulting
computations.

Briefly, the present invention is a visualization
tool that displays the output of a feature tracking or
optical flow algorithm in a type of three-dimensional
"spaghetti graph." The spaghetti graph enables a human
user to identify and eliminate outliers and other bad
track matches from the results of a feature tracking
algorithm performed on the original 2D image sequence.
The technique involves building a track in three
dimensions representing the movement of a single
feature through the sequence of images, and furthermore
building a track for any number of features in the
sequence. The display provides a representation of the
tracks in a 3D coordinate system where the x and y
coordinates are the coordinates of a feature within the
image coordinate system, and the z coordinate is a
number associated with the temporal ordering of each
image frame in the sequence of images.

The tracks are preferably marked with an attribute
of a selected pixel in the feature in the originating
image in order to further allow the user to visually
separate the tracks. For example, the marked track may
be colored in the same color as the selected pixel in
the case of a color image, or set to a corresponding
grey scale value in the case of a black and white
image.

The result is a three-dimensional display of
marked tracks representing the evolution of the optical
flow over time. The 3D track representation may be
manipulated by rotation, scaling, zooming, viewpoint
modification, and other standard 3D viewer tools which
permit the user to view a 3D object from various angles
on a 2D computer monitor. This permits the user to
identify problem areas such as broken paths or lost
features indicated by places in the graph where tracks
are not smooth, tracks end or begin abruptly, tracks
cross one another, or have other anomalies.

The graph may therefore be used to evaluate the
quality of different feature tracking runs and/or
algorithms.

The invention provides further benefits in terms
of producing feature tracking outputs which are of
greater accuracy by eliminating the very features which
cause most errors in computations. For example, once
problem areas in the optical flow are identified and/or
corrected, the user can rerun feature tracking
algorithms or improve their results by excluding
bad tracks or outliers from camera path or
scene model analysis.

BRIEF DESCRIPTION OF THE DRAWINGS

The file of this patent contains at least one
drawing executed in color. Copies of this patent with
color drawing(s) will be provided by the Patent and
Trademark Office upon request and payment of the
necessary fee.

Fig. 1 is a block diagram of an image processing
system in which a feature track visualization technique
may be used according to the invention.

Fig. 2 is a more detailed view of a sequence of
images and a feature point generation process showing
their interaction with a feature tracking, scene
modeling, and camera modeling process.

Fig. 3 is an exemplary view of a camera, its image
plane, and the derivation of feature points, scene
structure and camera models.

Fig. 4 is a flow chart of a sequence of steps
performed in order to produce a feature track
visualization according to the invention.

Fig. 5 is a set of steps that may be performed
subsequent to the visualization process of Fig. 4 to
identify and remove bad tracks or anomalies from
subsequent processing.

Fig. 6 is an exemplary first image from a sequence
of images.

Fig. 7 is an exemplary 3D feature track
visualization.

Fig. 8 is the same feature track visualization but
viewed from a second viewpoint with a higher zoom
factor, illustrating a bad track having an anomaly.

Fig. 9 is an even closer view illustrating a
broken track.

DETAILED DESCRIPTION OF A PREFERRED EMBODIMENT

Turning attention now in particular to the
drawings, Fig. 1 is a block diagram of the components
of a digital image processing system 10 in which a
feature track visualization technique according to the
invention may be implemented. The system 10 includes a
computer workstation 20, a computer monitor 21, and
input devices such as a keyboard 22 and mouse or stylus
23. The workstation 20 also includes input/output
interfaces 24, storage 25, such as a disk 26 and random
access memory 27, as well as one or more processors 28.
The workstation 20 may be a computer graphics
workstation such as the O2/Octane sold by Silicon
Graphics, Inc., a Windows NT type workstation, or
other suitable computer or computers. The computer
monitor 21, keyboard 22, mouse or stylus 23, and other
input devices are used to interact with various
software elements of the system existing in the
workstation 20 to cause programs to be run and data to
be stored as described below.

The system 10 also includes a number of other
hardware elements typical of an image processing
system, such as a video monitor 30, audio monitors 31,
hardware accelerator 32, and user input devices 33.
Also included are image capture devices, such as a
video cassette recorder (VCR), video tape recorder
(VTR), and/or digital disk recorder (DDR) 34, cameras
35, and/or film scanner/telecine 36. Sensors 38 may
also provide information about the scene and image
capture devices.

One aspect of the present invention is concerned
with a technique for visualizing an array of feature
points derived from a sequence of images provided by
one of the image capture devices. As shown in Fig. 2,
a sequence 50 of images 51-1, 51-2, ..., 51-F is
provided to a feature point generation process 54. For
example, the images 51 may be provided at a D1
resolution of 720 by 486 pixels. Each entry in the
feature array 58, however, may actually represent a
feature selected over the tiled image 51, such as over
a 5x5 or a 7x7 pixel tile.

An output of the feature point generation process
54 is a set of arrays 58-1, 58-2, ..., 58-F of feature
points, with typically an array 58 for each input image
51.

As a result of creating the feature point arrays
58, a feature track process 61, a scene structure
modeling process 62, a camera modeling process 63, or
other image processing techniques may be applied to
derive further information from the image sequence 50.

Feature tracking 61 may, for example, estimate the
path or "directional flow" of two-dimensional shapes
across the image sequence 50, or estimate
three-dimensional paths of selected feature points. The
scene structure model 62 may derive information about
the relative distances or "depth" of objects in the
image sequence 50. The camera modeling process 63
may estimate one or more camera paths in three
dimensions from multiple feature points.

Considering the scene structure modeling 62 and
camera modeling 63 more particularly, the sequence 50
of images 51-1, 51-2, ..., 51-F is typically taken
from a camera that is moving relative to objects in a
scene. Imagine that we locate P feature points 52 in
the first image 51-1. Feature points 52 are often
selected to be the corners of objects in the images 51,
although other selection methods may be used. Each
feature point 52 corresponds to a single world point,
located at position s_p in some fixed world coordinate
system. This point will appear at varying positions in
each of the following images 51-2, ..., 51-F, depending
on the position and orientation of the camera in that
image, and depending upon whether the point moves or
remains fixed over time in world coordinates relative
to the camera.

The observed image position of point p in frame f
is written as the two-vector u_fp containing its image
x- and y-coordinates, which is sometimes written as
(u_fp, v_fp). These image positions are measured by
tracking the feature from frame to frame using known
feature tracking techniques.

The camera position and orientation in each frame
is described by a rotation matrix R_f and a translation
vector t_f representing the transformation from world
coordinates to camera coordinates in each frame. It is
possible to physically interpret the rows of R_f as
giving the orientation of the camera axes in each
frame: the first row, i_f, gives the orientation of the
camera's x-axis, the second row, j_f, gives the
orientation of the camera's y-axis, and the third row,
k_f, gives the orientation of the camera's optical axis,
which points along the camera's line of sight. The
vector t_f indicates the position of the camera in each
frame by pointing from the world origin to the camera's
focal point. This formulation is illustrated in Fig. 3.
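
As a rough illustration of this convention, the sketch below expresses a world point in the camera frame by subtracting the camera position and taking the dot product with each row of the rotation matrix. It assumes the reading given above (rows of the rotation matrix are the camera axes, and the translation vector points from the world origin to the focal point); the function name `world_to_camera` and the sample values are hypothetical and are not part of the described system.

```python
def world_to_camera(s_p, R_f, t_f):
    """Express world point s_p in camera coordinates for one frame.

    R_f is a 3x3 rotation given as three rows (i_f, j_f, k_f);
    t_f points from the world origin to the camera's focal point.
    """
    # Offset of the point from the focal point, in world coordinates.
    d = [s - t for s, t in zip(s_p, t_f)]
    # Each row of R_f is a camera axis; the dot product with that row
    # gives the coordinate of the point along that axis.
    return [sum(r * c for r, c in zip(row, d)) for row in R_f]


# Hypothetical example: camera at (0, 0, -5), axes aligned with the world.
R_f = [[1.0, 0.0, 0.0],
       [0.0, 1.0, 0.0],
       [0.0, 0.0, 1.0]]
t_f = [0.0, 0.0, -5.0]
print(world_to_camera([1.0, 2.0, 0.0], R_f, t_f))  # [1.0, 2.0, 5.0]
```

The third component is the depth of the point along the optical axis, which is the quantity a projection model then uses.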

The process of projecting a three-dimensional
point onto the image plane in a given frame is referred
to as projection. This process models the physical
process by which light from a point in the world is
focused on the camera's image plane, and mathematical
projection models of various degrees of sophistication
can be used to compute the expected or predicted image
positions P(f,p) as a function of s_p, R_f, and t_f. In
fact, this process depends not only on the position of
a point and the position and orientation of the camera,
but also on the complex lens optics and image
digitization characteristics. The projection models may
include an orthographic projection model, scaled
orthographic projection model, para-perspective
projection model, perspective projection model, radial
projection model, or other types of models. These
models have varying degrees of mathematical
sophistication and complexity, and account for the
actual physics of image formation to increasingly
accurate degrees.
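
To make the difference between two of these models concrete, the following minimal sketch projects a point that is already expressed in camera coordinates using an orthographic model and a simple pinhole perspective model. It deliberately ignores lens optics and digitization effects; the function names and the `focal_length` parameter are assumptions chosen for illustration.

```python
def project_orthographic(cam_pt):
    """Orthographic model: drop the depth coordinate."""
    x, y, _z = cam_pt
    return (x, y)


def project_perspective(cam_pt, focal_length=1.0):
    """Perspective model: scale x and y by focal_length / depth."""
    x, y, z = cam_pt
    if z <= 0:
        raise ValueError("point must lie in front of the camera")
    return (focal_length * x / z, focal_length * y / z)


cam_pt = (1.0, 2.0, 5.0)  # hypothetical point in camera coordinates
print(project_orthographic(cam_pt))  # (1.0, 2.0)
print(project_perspective(cam_pt))   # (0.2, 0.4)
```

The orthographic result does not change with depth, whereas the perspective result shrinks as the point moves away from the camera.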
One such camera movement and surface mesh modeling
algorithm is described in Poelman, C. J., "The
Paraperspective and Projective Factorization Methods
for Recovering Shape and Motion," Carnegie Mellon
University, School of Computer Science Report
CMU-CS-95-173 dated 12 July 1995.

The specific algorithms used to derive a scene
structure 62 or camera model 63 are not of particular
importance to the present invention. Rather, the
present invention is concerned with a technique for
developing a visual representation of the arrays of
feature points 58 to better permit identification of
errors and anomalies therein.

The feature points developed from the image
sequence 50 are stored in the feature array 58 as a
number of associated image feature entries 60. For
example, each entry 60 in the feature array 58 contains
at least (1) a grid position (GRID POS) or "(x,y)
coordinate" and (2) a flow vector (FLOW) or "path."
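
One possible in-memory layout for such entries is sketched below. It is only an assumed representation: the class name `FeatureEntry`, the field names, and the nested-list indexing are illustrative choices, not the layout of the described feature array.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class FeatureEntry:
    """One entry: a grid position plus a flow vector."""
    grid_pos: Tuple[float, float]            # GRID POS: (x, y) in image coordinates
    flow: Tuple[float, float] = (0.0, 0.0)   # FLOW: displacement to the next image


# Assumed indexing: feature_array[f][p] is feature p as seen in image f.
FeatureArray = List[List[FeatureEntry]]

feature_array: FeatureArray = [
    [FeatureEntry((10.0, 20.0), (1.5, -0.5))],  # first image
    [FeatureEntry((11.5, 19.5))],               # second image
]
print(feature_array[0][0].grid_pos, feature_array[0][0].flow)
```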

The path data for the feature array 58 is developed
by applying a feature tracking algorithm 61 across
successive images 51. Consider an example where the
image stream 50 contains images of a rotating cube 68
against a uniform dark background. The visual corners
52 of the cube 68 are what is traditionally detected
and tracked as feature points. The GRID POS data for
each feature point in image 51-1 is thus the (x,y)
position of each feature point in the first array 58-1.

As the image stream progresses, a second image
51-2 in the sequence has the cube rotated to a
different position. As shown, a corresponding movement
of the feature points 52 occurs. The grid positions of
the feature points are thus stored in a second array
58-2 of the feature array 58.

Therefore, across each image pair, a sub-pixel
directional flow vector can be generated representing
the movement of each feature point 52. The vectors are
generated between the first image 51-1 and second image
51-2, the second image 51-2 and third image 51-3, and
so on up to the F'th image 51-F.

A corresponding flow vector can thus be derived
for each feature point pair which determines the
sub-pixel location of the feature point in a next
successive image. Data representing the flow vector
for each feature point is stored in the PATH entries in
feature array 58. A given directional flow vector, for
example, associated with the subsequent images 51, may
have a different magnitude and direction as the speed
and direction of the cube 68 changes.
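
The relationship between successive positions and flow vectors can be sketched as a simple difference between adjacent images. This assumes the positions of one feature point are already known in every image; the function name `flow_vectors` and the sample coordinates are hypothetical.

```python
def flow_vectors(positions):
    """Return the per-pair flow vectors for one feature point.

    positions[f] is the (x, y) sub-pixel location in image f, so the
    result has one fewer element than the input.
    """
    return [
        (x1 - x0, y1 - y0)
        for (x0, y0), (x1, y1) in zip(positions, positions[1:])
    ]


# Hypothetical corner of the cube observed in three successive images.
corner = [(10.0, 20.0), (11.5, 19.5), (13.5, 19.0)]
print(flow_vectors(corner))  # [(1.5, -0.5), (2.0, -0.5)]
```

Here the second vector is longer than the first, which corresponds to the feature moving faster between the second and third images.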

Fig. 4 is a sequence of steps that can now be 
performed given that the feature array 58 containing 
sets of feature points and flow vectors for each image
is available. 

From an idle state 100, a first state 102 is 
entered in which the feature track algorithm is used to 
define feature points and paths for each frame as 
already described.

The following states 104 through 110 are
executed for each feature point array 58.

Likewise, beginning in state 106, a loop is
performed for each image, f, in the array.

In state 108, a track segment is built in three
dimensions for each feature point 52 from its data
associated with each image in the feature array 58. In
particular, a track segment is built in three
dimensions by plotting a line segment beginning at a
location (x,y,f) where the x and y coordinates
correspond to the relative position of the feature
point in its associated image 51, and its location
along the z axis is a number, f, associated with the
temporal ordering or the "index" of each image 51 in
the sequence 50.

Once the start coordinates are known, the line
segment is drawn in the direction given by the
corresponding path vector.
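
The construction of track segments can be sketched as follows. The sketch assumes that each segment ends one step further along the z axis, at index f + 1, which the description does not state explicitly; the function name `build_track_segments` and the sample data are likewise hypothetical.

```python
def build_track_segments(positions, flows, first_index=0):
    """Return 3D line segments ((x, y, f), (x', y', f + 1)) for one feature.

    positions[f] is the (x, y) grid position in image f and flows[f] is
    the flow vector leading from image f to image f + 1.
    """
    segments = []
    for f, ((x, y), (dx, dy)) in enumerate(zip(positions, flows), start=first_index):
        start = (x, y, f)
        end = (x + dx, y + dy, f + 1)  # assumed: one step along the time axis
        segments.append((start, end))
    return segments


positions = [(10.0, 20.0), (11.5, 19.5)]
flows = [(1.5, -0.5), (2.0, -0.5)]
for seg in build_track_segments(positions, flows):
    print(seg)
# ((10.0, 20.0, 0), (11.5, 19.5, 1))
# ((11.5, 19.5, 1), (13.5, 19.0, 2))
```

Because each segment ends where the next one starts, the segments for one feature join into a single continuous track.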

In state 110, the track segment is actually
rendered on the display. In particular, in this state
110, the track is rendered in a color that is the same
as the feature point's color in the first frame of the
sequence 50.
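
A minimal sketch of how a track color might be looked up is given below. It assumes the first frame is available as a row-major grid of pixel values and that a sub-pixel position is rounded to the nearest pixel; both choices, and the function name `track_color`, are illustrative assumptions.

```python
def track_color(first_image, grid_pos):
    """Return the pixel value under a feature's first-frame position.

    first_image is a row-major list of rows; each pixel is an (r, g, b)
    tuple or a single grey-scale value. Sub-pixel positions are rounded
    to the nearest pixel and clamped to the image bounds.
    """
    x, y = grid_pos
    height, width = len(first_image), len(first_image[0])
    col = min(max(int(round(x)), 0), width - 1)
    row = min(max(int(round(y)), 0), height - 1)
    return first_image[row][col]


# Hypothetical 2x2 color image.
image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (128, 128, 128)],
]
print(track_color(image, (0.8, 0.2)))  # (0, 255, 0)
```

The same function returns a grey-scale value unchanged when the image stores one number per pixel.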

States 104 through 110 are iterated until a track
is displayed representing the movement of a single
feature point throughout an entire sequence of images
and such a track is built for each feature point in the
image. The result is a set of colored tracks
representing the evolution of the optical flow over
time through the image sequence 50.

In state 112, the result is then displayed to the
user, and the user is permitted in state 114 to change
the viewpoint via rotation, zooming, and other standard
3D viewer tools in order to evaluate the quality of the
feature tracking algorithm. In particular, the user
may assess the quality of the particular feature
tracking algorithm implemented to easily identify
problem areas such as places in which the tracks are
not smooth, tracks begin or end abruptly, tracks cross
one another, or have other anomalies.
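
The effect of a viewpoint change on the displayed tracks can be sketched with a rotation about the vertical axis followed by a simple orthographic view with a zoom factor. This is only an illustration of the geometry; an actual viewer would normally rely on a 3D graphics library, and the function names here are hypothetical.

```python
import math


def rotate_about_y(point, angle_deg):
    """Rotate an (x, y, z) track point about the vertical (y) axis."""
    x, y, z = point
    a = math.radians(angle_deg)
    return (x * math.cos(a) + z * math.sin(a),
            y,
            -x * math.sin(a) + z * math.cos(a))


def to_screen(point, zoom=1.0):
    """Orthographic view: drop depth and apply a zoom factor."""
    x, y, _depth = point
    return (zoom * x, zoom * y)


# Hypothetical track; z is the image index.
track = [(10.0, 20.0, 0), (11.5, 19.5, 1), (13.5, 19.0, 2)]
for p in track:
    print(to_screen(rotate_about_y(p, 90.0), zoom=2.0))
```

With a 90 degree rotation the image index ends up on the horizontal screen axis, so the track is seen from the side and abrupt jumps in y become easy to spot.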
For example, turning attention briefly to Fig. 6,
there is shown a view of a scene in which a woman is
seated in a room next to a fireplace. Fig. 7 shows one
view of a feature track visualization produced from
this scene according to the sequence of steps performed
in Fig. 4. The sequence of images was taken by panning
the camera around the seated woman in the room. The
particular feature points can be traced more or less
back to their origin points in the first image in the
sequence by coordinating the color of the feature
points with the colors of various regions in the first
image in the sequence.

Fig. 8 is a view of the same set of tracks
taken from a closer viewpoint. Notice that one of
the tracks 200 has an anomaly in that it has a sharp
peak in a region of otherwise smooth tracks. The user
can recognize this as an anomaly because the camera
movement could not possibly have produced such a peak
for only one feature of the image when other
surrounding features in the same portion of the image
exhibit much smoother flow.

Fig. 9 is an even more detailed view of a
track 210 which is considered to be "bad" in that there
is an obvious break or premature end point for the
track 210.

The process of Fig. 4 may therefore be used to
evaluate the performance of a particular feature
tracking algorithm 61. However, the process can
additionally be applied so that the user intervenes in
automatic scene modeling and camera path algorithms in
order to produce higher quality results.

For example, when viewing a three-dimensional flow
display such as that of Figs. 7, 8 or 9, the user can
identify anomalies and other problem areas in the flow,
such as unsmooth tracks, tracks that appear to flow in
physically impossible directions, crossing tracks, and
interrupted tracks as before. Once such tracks have
been identified, the user can alter or remove them
entirely from subsequent processing in order to reduce
the noise in the input to automatic algorithms and
thereby improve their output.

For example, turning attention to Fig. 5, the process
may begin from an idle state 100, performing the states
102 through 112 as in Fig. 4. However, at the end of
state 112, a state 130 may be entered in which the user
identifies a bad track from three-dimensional displays
such as the track that was shown in Fig. 9.

In a state 132, this track can be identified as a
track which should be removed from further analysis.
Thus in this state, for example, an entry is made in
the feature array 58 to indicate the status is "bad."
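
One way such a status entry could be recorded and later honored is sketched below. The separate status dictionary, the function names, and the use of the reference numerals 200 and 210 as track identifiers are all assumptions made for illustration.

```python
def mark_bad(track_status, track_id):
    """Record that a track should be left out of further analysis."""
    track_status[track_id] = "bad"


def usable_tracks(tracks, track_status):
    """Return only the tracks that have not been marked bad."""
    return {tid: pts for tid, pts in tracks.items()
            if track_status.get(tid) != "bad"}


# Hypothetical tracks keyed by an identifier.
tracks = {
    200: [(10.0, 20.0), (11.5, 19.5)],
    210: [(40.0, 8.0)],  # stand-in for a broken track
}
status = {}
mark_bad(status, 210)
print(sorted(usable_tracks(tracks, status)))  # [200]
```

Keeping the status separate from the positions means a track can be restored later simply by clearing its entry.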

Bad tracks, for example, are most often found
in outlying areas of the scene, most likely as a
result of the fact that information on the edges of a
particular image 51 typically changes more rapidly than
the information in the center of the image. The
subsequent image processing algorithm, such as a feature
tracking algorithm 61, a camera modeling algorithm 63,
or scene modeling algorithm 62, may then be run without
using such a track, with improved results.

Similarly, from state 112, the user may enter a
state 140 in which an anomaly in a track is identified.
In state 142, the system 10 may permit the user to
specify a correction to this particular track. This
correction is reflected in a modification to the
entries in the feature point array such as by modifying
the location of an x,y point in the array or visually
changing its corresponding path vector with the input
device 23.
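
A correction of this kind can be sketched as replacing one position and recomputing the flow vectors so that the track stays consistent. The function name `correct_point` and the sample track are hypothetical, and recomputing every flow vector is a simplification chosen for brevity.

```python
def correct_point(positions, f, new_xy):
    """Replace the position in image f and recompute all flow vectors.

    Returns (positions, flows) as new lists; the inputs are not modified.
    """
    fixed = list(positions)
    fixed[f] = new_xy
    flows = [(x1 - x0, y1 - y0)
             for (x0, y0), (x1, y1) in zip(fixed, fixed[1:])]
    return fixed, flows


# Hypothetical track with a sharp peak in the middle image.
positions = [(10.0, 20.0), (11.0, 35.0), (12.0, 20.0)]
fixed, flows = correct_point(positions, 1, (11.0, 20.0))
print(flows)  # [(1.0, 0.0), (1.0, 0.0)]
```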

The corrected tracks are then applied in state 150
to the subsequent feature track 61, camera modeling 63,
or scene modeling 62. In this way the user identifies
points in the scene that appear incorrect, such as
points whose position does not correspond to the user's
understanding of the scene geometry.

It should be understood that the processes
described in Fig. 4 and Fig. 5 can then be iterated,
as indicated in state 100 through state 150, to further
refine the process with user input.

EQUIVALENTS 

While this invention has been particularly shown
and described with references to preferred embodiments
thereof, it will be understood by those skilled in the
art that various changes in form and details may be
made therein without departing from the spirit and
scope of the invention as defined by the appended
claims. Those skilled in the art will recognize or be
able to ascertain using no more than routine
experimentation, many equivalents to the specific
embodiments of the invention described specifically
herein. Such equivalents are intended to be
encompassed in the scope of the claims.



CLAIMS 



What is claimed is:

1. A method for a visualization of optical flow for a
time sequence of images comprising the steps of:

forming a feature point array from the image
sequence, with entries in the feature point array
corresponding to the coordinate positions of
feature points in an image of the array and
associated flow vector information; and

deriving a flow graph representation in three
dimensions from the feature point array wherein
coordinate positions along a first pair of
coordinate axes correspond to coordinate
positions in a source image of the image
sequence, and wherein coordinate positions along
an orthogonal depth axis of the visualization
correspond to an index number of the corresponding
image from which the feature point was taken.

2. A method as in claim 1 wherein the flow graph
representation for a given feature point is a
track comprising a series of line segments
illustrating the change in position of the feature
point over a corresponding series of images in the
image sequence.

3. A method as in claim 2 wherein the track is marked
with an attribute of the feature point in one of
the images in the series.

4. A method as in claim 3 wherein the attribute is a
color.

5. A method as in claim 3 wherein the attribute is a
grey scale value.

6. A method as in claim 1 wherein the user is
permitted to identify anomalies in the flow graph
representation.

7. A method as in claim 6 wherein the anomalies are
used to control inputs to a subsequent automatic
image processing algorithm.

8. A method as in claim 6 wherein the anomalies
include locations in the flow graph representation
which end abruptly indicating where a feature
track was lost.

9. A method as in claim 8 wherein the lost feature
track is excluded from the subsequent image
processing algorithm.

10. A method as in claim 8 wherein the user supplies
an input indicating how the lost feature track can
be recovered by stitching it to another feature
track.